Common institutional investors and the quality of management earnings forecasts—Empirical and machine learning evidences

Using data on Chinese A-share firms listed on the Shanghai and Shenzhen Stock Exchanges from 2014 to 2021, this article explores the relationship between common institutional investors and the quality of management earnings forecasts. Multiple linear regression shows that common institutional investors improve the precision of earnings forecasts. This article also uses graph neural networks to predict the precision of earnings forecasts. Our findings show that common institutional investors act as external monitors that restrain management from releasing wide earnings forecast ranges, which strengthens the risk warning function of earnings forecasts and promotes the sustainable development of management information disclosure in the Chinese capital market. This paper makes three marginal contributions. First, it enriches the literature on the economic consequences of common institutional shareholding. Second, the neural network method used to predict the quality of management forecasts extends the methods available for studying institutional investors and management earnings forecast behavior. Third, this paper calls for strengthening information sharing and circulation among institutional investors to reduce information asymmetry between investors and management.


Introduction
In 2020, a firm listed on the Shenzhen Stock Exchange disclosed an earnings forecast range for FY2019 of 4 million yuan to 8 million yuan; management later narrowed the range to 4 million yuan to 6 million yuan. Rendong Holdings (002647) disclosed a net profit range for FY2021 of a loss of 100 million to 200 million yuan, a range that was too wide relative to the originally estimated range, and it received a supervisory letter from the Shenzhen Stock Exchange. For particular motives, shareholders and management strategically release a wider earnings forecast range that just meets the minimum regulatory requirements, obscuring the judgment of investors, creditors, and customers about the firm's performance [1]. This strategic disclosure behavior weakens the risk warning function of earnings forecasts, reduces their information content, and undermines the development of sustainable accounting information disclosure in the capital market [2,3]. In exploring the external factors that influence the quality of management earnings forecasts, prior literature has focused mainly on analyst following, market competition [4], and exchange inquiry letters [5]. Internal factors such as executive traits, compensation incentives, and shareholding structure [6] also have an important impact on earnings forecasts. Institutional shareholding, as a crucial external monitoring mechanism, influences management's voluntary disclosure [6,7].
With the growth of the professional asset management industry and the diversification strategy of "not putting all your eggs in one basket", a distinctive ownership structure has emerged: common institutional investors, that is, institutional shareholders that hold large stakes in competing companies in the same industry and have a significant influence on corporate decisions [6,8-10]. Prior literature shows that shareholders influence management's behavioral decisions and that managers are well aware of large shareholders' incentives [11,12]. The sensitivity of executive compensation to performance gradually decreases as institutional co-ownership in the industry increases [13]. Common institutional shareholders change the competitive preferences of the industry [14], and management should be able to perceive such changes and adjust its decisions in time [15]. Studies of common institutional shareholders focus on corporate competition [16-19], financing ability [20], corporate innovation [21], and corporate governance [22-24].
In recent years, research on how institutional shareholding affects earnings forecast disclosure has gradually shifted from single institutions to institutional networks, generally defined by the links formed when multiple institutions invest in the same enterprise [10]. Because the information disclosure environment in China is semi-mandatory, research gaps remain on whether and how the distinctive shareholding structure of common institutional investors within an industry affects the quality of management earnings forecasts of China's listed companies. The marginal contributions and implications of this paper are as follows. First, this paper examines the impact of common institutional investors on the quality of management earnings forecasts, which enriches the literature on the economic consequences of common institutional shareholding. Second, we introduce the non-controlling large shareholder exit threat variable (NET) and use a neural network method to predict the precision of management forecasts, which extends the methods used to study institutional investors and management earnings forecast behavior. Finally, the paper provides empirical and machine learning support for information sharing and collaborative supervision among industry associations, listed companies, and investors, which helps to further standardize corporate governance structures and promote the high-quality development of listed companies.

Theory background and hypothesis development
Earnings forecasts are forward-looking, convey the macroeconomic conditions of the industry in which the firm operates, and contain proprietary information about the firm's operating conditions, market position, and risks [25,26]. Competitors commonly extract proprietary information from earnings forecast disclosures to adjust their competitive strategies, which harms the disclosing firm's competitive position in the market. The proprietary cost of information is therefore one of the main constraints on management earnings forecasts. First, as common institutional investors' shareholdings within the same industry gradually soften product market competition, management is less concerned that competitors will use the proprietary information conveyed in the disclosure to gain excess market share or profits, and it relaxes its restrictions on full disclosure, given that the proprietary cost is lower while the benefits remain unchanged [6]. Second, regulators require and encourage complete and accurate management earnings forecasts; the potential compliance costs create a positive incentive for managers to provide high-quality earnings forecasts to the public. Third, in shareholder-initiated governance proposals, institutional investors can vote against incompetent management or even have incompetent executives ousted outright [24]. Management therefore has an incentive to take the competitive preferences of common institutional investors into account, both to avoid the risk of dismissal and to improve the quality of management earnings forecasts. Based on this, hypothesis H1 is proposed:
H1: Common institutional investors significantly enhance the precision of management earnings forecasts.
Building on the above analysis, this paper further explores predicting the quality of listed firms' earnings forecasts, which makes the study more practically relevant. Institutional blockholders can govern firms through the threat of "exit", that is, selling their shares when managers underperform [8]. Mutual funds react strongly to large shareholders' exits, leading to correlated exits that enhance corporate governance [27]. In emerging capital markets, the agency problem mainly manifests as a conflict between controlling shareholders and other shareholders, and it is not uncommon for controlling shareholders to encroach on the interests of other shareholders. Compared with small shareholders, non-controlling large shareholders hold many shares and have specific professional skills. When intervention is ineffective, non-controlling large shareholders often use the threat of "exit" as a bargaining chip with controlling shareholders. The exit threat is therefore vital for large shareholders to achieve their governance goals [8].
With the development of machine learning technology, machine learning algorithms have been applied to the securities market in three main ways. The first is to analyze and predict price fluctuations in the securities market. For example, support vector machines [28], neural networks [29], and long short-term memory (LSTM) networks [30] have been applied to predict stock market fluctuations. The second is to analyze the effectiveness of stock evaluation indicators. For example, genetic algorithms [31], intelligent computing [32], deep learning [33], and other algorithms can effectively select and evaluate stock evaluation indicators. The third is to simulate mechanisms in the securities market, using intelligent algorithms to explain and represent the stock market momentum effect [34], the herding effect [35], irrational factors [36], and other anomalies. As a major direction of artificial intelligence, deep learning models support distributed processing and can continuously learn by updating their weights, which makes them suitable for nonlinear data. Therefore, we use deep learning to build a model that predicts the precision of management earnings forecasts for listed firms.

Data
We selected A-share firms listed on the Shanghai Stock Exchange and the Shenzhen Stock Exchange of China from 2014 to 2021. The Chinese Ministry of Finance (MOF) revised and introduced nine standards in 2014, including the basic accounting standard, long-term equity investment (CAS 2), and financial instruments (CAS 22). For the sake of robustness, 2014 is selected as the starting year of the sample. We screened the sample according to the following principles: (1) exclude firms in the financial and insurance industries; (2) exclude ST firms, where ST refers to the special treatment applied to listed firms with abnormal financial positions; (3) exclude observations in the year of IPO; (4) exclude observations with missing main variables in the sample period. All continuous variables are winsorized at the 1% level to mitigate the effect of extreme values. The data on management earnings forecasts are obtained from the WIND database; data on common institutional investors, corporate finance, and corporate governance are obtained from the CSMAR database. The descriptive statistics, correlation analysis, and multiple regression analysis of the main variables are produced with STATA 17.0 and Python 3.
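The winsorization step mentioned above can be illustrated with a minimal sketch. It assumes that "the 1% level" means clipping each continuous variable at its 1st and 99th percentiles, which is a common reading but is not spelled out in the text; the function name `winsorize` and the sample values are hypothetical.

```python
import numpy as np

def winsorize(values, lower_pct=1.0, upper_pct=99.0):
    """Clip values at the given lower and upper percentiles.

    Assumption: two-sided winsorization at the 1st and 99th percentiles.
    """
    values = np.asarray(values, dtype=float)
    lo, hi = np.percentile(values, [lower_pct, upper_pct])
    return np.clip(values, lo, hi)

# Hypothetical variable with one extreme observation at each end.
raw = np.arange(1, 1001, dtype=float)
clipped = winsorize(raw)
```

Clipping keeps every observation in the sample while limiting the leverage of outliers, which is why it is usually preferred to dropping extreme rows.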

Key measures
1. Quality of earnings forecast (PRECS). In this paper, we use the precision of the earnings forecast as a proxy for its quality [37], measured by the width of the management earnings forecast range. PRECS is calculated as the difference between the upper and lower limits of the forecast, divided by the absolute value of their mean [1]. A smaller value indicates a more precise forecast, and precision is highest when the value is zero.
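As a small worked illustration of this definition, the sketch below computes the width measure for the forecast ranges quoted in the Introduction. The function name `forecast_width` is hypothetical, and returning NaN for a zero midpoint is an assumption made here, since the text does not say how such cases are handled.

```python
def forecast_width(lower, upper):
    """PRECS: (upper - lower) / |midpoint of the range|. Smaller means more precise."""
    midpoint = (upper + lower) / 2.0
    if midpoint == 0:
        return float("nan")  # assumption: undefined when the range is centred on zero
    return (upper - lower) / abs(midpoint)

# Ranges in millions of yuan, taken from the examples in the Introduction.
initial_range = forecast_width(4.0, 8.0)      # original FY2019 range
revised_range = forecast_width(4.0, 6.0)      # narrowed FY2019 range
loss_range = forecast_width(-200.0, -100.0)   # loss of 100 to 200 million yuan
```

The narrowed range yields a smaller PRECS than the original one, which matches the reading that a lower value means higher precision.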

2. Common institutional investors (Coz). Drawing on existing literature [6,20,24,38,39], we use quarterly firm data and retain institutional investors with shareholdings of 5% or more. The 5% threshold is chosen because it is the threshold for significant shareholding under Chinese securities laws and regulations. If an institutional investor holds at least 5% of the shares of two or more firms in the same industry in the same quarter, it is regarded as a common institutional investor. We construct four indicators of common institutional shareholding: the dummy variable Coz_dum indicates the existence of common institutional investors; Coz_num indicates the number of common institutional investors; Coz_degree indicates the degree of connection of common institutional investors; and Coz_rate represents the percentage of shares held by common institutional investors.
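The construction of the indicators can be sketched as follows. This is a minimal illustration, not the authors' code: the record layout, the function name `common_ownership`, the use of the natural logarithm, the averaging over four quarters, and the reading that an institution is "common" when it holds at least 5% in two or more same-industry firms in a quarter are all assumptions. Coz_degree is omitted for brevity.

```python
import math
from collections import defaultdict

THRESHOLD = 0.05  # 5% significant-shareholding threshold from the text

def common_ownership(holdings, quarters_per_year=4, min_firms=2):
    """holdings: (institution, firm, industry, quarter, stake) records for one year."""
    big = [h for h in holdings if h[4] >= THRESHOLD]
    # Firms in which each institution holds >= 5%, per industry and quarter.
    firms_held = defaultdict(set)
    for inst, firm, industry, quarter, _ in big:
        firms_held[(inst, industry, quarter)].add(firm)
    count = defaultdict(float)  # common institutions, summed over quarters
    rate = defaultdict(float)   # their stakes, summed over quarters
    for inst, firm, industry, quarter, stake in big:
        if len(firms_held[(inst, industry, quarter)]) >= min_firms:
            count[firm] += 1
            rate[firm] += stake
    result = {}
    for firm in {h[1] for h in holdings}:
        avg_n = count[firm] / quarters_per_year
        result[firm] = {
            "Coz_dum": int(count[firm] > 0),
            "Coz_num": math.log(1 + avg_n),            # assumed natural log
            "Coz_rate": rate[firm] / quarters_per_year,
        }
    return result

# Hypothetical data: F1 holds >= 5% of firms A and B (same industry) all year.
demo = []
for q in (1, 2, 3, 4):
    demo += [("F1", "A", "X", q, 0.06), ("F1", "B", "X", q, 0.07),
             ("F2", "C", "X", q, 0.10), ("F3", "A", "X", q, 0.03),
             ("F3", "B", "X", q, 0.08)]
coz = common_ownership(demo)
```

In this toy data only F1 qualifies as a common institutional investor: F2 holds a single firm, and F3's stake in firm A falls below the threshold.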
3. Control variables. Based on the existing literature [4-6,37,40], the following control variables are selected: firm size (SIZE), leverage ratio (LEV), sales growth (GSALES), profitability (ROA), cash level (OCF), dual role (DUAL), board size (BRD), analyst following (ANA), management shareholding ratio (MGTSHR), executive compensation (PAY), Big 4 audit (BIG4), and voluntary disclosure (VOL). The variables are defined in Table 1.
4. Non-controlling Large Shareholder Exit Threat (NET). Dou et al. argue that the blockholder's exit threat is mainly affected by stock liquidity and competition among large shareholders, and they use the product of stock liquidity and large shareholder competition as a proxy for the blockholder exit threat [41]. The more liquid the stock and the more intense the competition among large shareholders, the higher the firm's blockholder exit threat [42].
Here, $BHC_{i,t}$ is the degree of competition among non-controlling large shareholders of firm i in year t, $NCLS_{k,i,t}$ is the shareholding ratio of the k-th non-controlling large shareholder of firm i in year t, and $SSBH_{i,t}$ is the sum of the shareholding ratios of all large shareholders of firm i in year t. All shareholding ratios refer to the proportion of tradable shares. A larger $BHC_{i,t}$ therefore indicates a higher degree of competition among non-controlling large shareholders. Finally, the econometric model of NET is constructed as follows:

Table 1 (variable definitions):
Coz_num: Logarithm of one plus the average number of common institutional investors of the firm.
Coz_degree: The degree of connection of common institutional investors, measured as the logarithm of one plus the average number of firms in the same industry held by all of the firm's common institutional investors.
Coz_rate: The sum of the quarterly shareholding ratios of common institutional investors, averaged over the year.
INSOWN: Logarithm of the ratio of shares held by institutional investors to the total number of issued shares.
BIGOWN1: Shareholding ratio of the largest shareholder as disclosed by the listed company.
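The exit threat construction described above can be sketched in code. The equations themselves are not reproduced in the text, so the functional form below is an assumption: blockholder competition is taken as the negative Herfindahl index of the non-controlling large shareholders' scaled stakes, a form that agrees with the statement that a larger BHC means more competition, and NET is the product of stock liquidity and BHC as described. The function names and input values are hypothetical.

```python
def blockholder_competition(ncls, ssbh):
    """Assumed form: BHC = -sum_k (NCLS_k / SSBH)^2 (negative Herfindahl index)."""
    return -sum((share / ssbh) ** 2 for share in ncls)

def exit_threat(liquidity, ncls, ssbh):
    """NET = stock liquidity x blockholder competition (product form from the text)."""
    return liquidity * blockholder_competition(ncls, ssbh)

# Two equal blockholders compete more than one dominant blockholder.
bhc_even = blockholder_competition([0.10, 0.10], 0.20)
bhc_single = blockholder_competition([0.20], 0.20)
net_even = exit_threat(2.0, [0.10, 0.10], 0.20)
```

Under this assumed form BHC is never positive, so a value closer to zero corresponds to more dispersed holdings and stronger competition among non-controlling large shareholders.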

Model and methodology
To verify the relationship between common institutional investors and the precision of management earnings forecasts, we construct the following model for OLS regression [3,6].
If the regression coefficient $\beta_1$ of common institutional investors ($Coz_{i,t}$) is significantly negative, common institutional investors narrow the width of the management earnings forecast and improve its precision. $Controls_{i,t}$ is a set of control variables, and $\varepsilon_{i,t}$ is an error term. Heteroskedasticity-robust standard errors clustered at the firm level are used to ensure the robustness of the model [43]; year (Year FE) and industry (Industry FE) fixed effects are included to control for omitted year-specific and time-invariant industry-level factors.
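The specification described above can be written out as follows. This is a hedged reconstruction from the symbols named in the text ($\beta_1$, $Coz_{i,t}$, $Controls_{i,t}$, $\varepsilon_{i,t}$, Year FE, Industry FE); the intercept $\beta_0$ and the coefficient vector $\gamma_j$ on the controls are notation assumed here.

```latex
PRECS_{i,t} = \beta_0 + \beta_1 \, Coz_{i,t}
            + \sum_{j} \gamma_j \, Controls_{j,i,t}
            + \text{Year FE} + \text{Industry FE} + \varepsilon_{i,t}
```

Because PRECS measures the width of the forecast range, $\beta_1 < 0$ corresponds to higher precision, which is the direction predicted by H1.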
In this study, the exit threat variable (NET) is introduced to further explore forecast precision. To capture the trend of NET over time, this paper uses a graph time series [28] embedding method that represents the exit threat variable as a graph over consecutive years. The embedding is shown in Fig 1: each node's feature is the exit threat variable of the corresponding year, nodes of adjacent years are connected by an edge, and each edge weight is the magnitude of the change in the threat variable between the two adjacent years. Fig 2 presents the construction of NET by graph embedding and the prediction of forecast precision with a neural network. In Fig 2, part ① first constructs the exit threat graph data following the method in Fig 1. Part ② then introduces the attention mechanism; the node and edge weight matrices are given in Eqs (4) and (5), respectively, and the attention mechanism in Eq (6). Here $d_k$ is the dimension of the vectors $w_n$ and $w_e$. Because the magnitude of the vector product grows with the dimension, the scaling factor $\sqrt{d_k}$ is introduced into the formula so that the attention matrix stays approximately standard normally distributed.
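The graph construction and the scaled attention step can be sketched with NumPy. This is a minimal illustration under stated assumptions: Eqs (4) to (6) are not reproduced in the text, so the attention function below uses the standard scaled dot-product form, softmax of $w_n w_e^\top / \sqrt{d_k}$, as a stand-in; the function names, the NET values, and the random matrices are hypothetical.

```python
import numpy as np

def build_net_graph(net_by_year):
    """Nodes are years (feature = NET); edges link adjacent years, weighted by |change|."""
    years = sorted(net_by_year)
    features = np.array([net_by_year[y] for y in years], dtype=float)
    n = len(years)
    adjacency = np.zeros((n, n))
    for i in range(n - 1):
        weight = abs(features[i + 1] - features[i])
        adjacency[i, i + 1] = adjacency[i + 1, i] = weight
    return years, features, adjacency

def scaled_dot_product_attention(w_n, w_e):
    """Assumed form of the attention step: softmax(w_n w_e^T / sqrt(d_k))."""
    d_k = w_n.shape[-1]
    scores = w_n @ w_e.T / np.sqrt(d_k)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)

# Hypothetical NET values for three consecutive years.
years, features, adjacency = build_net_graph({2019: -0.5, 2020: -0.2, 2021: -0.6})
rng = np.random.default_rng(0)
attention = scaled_dot_product_attention(rng.normal(size=(3, 4)), rng.normal(size=(3, 4)))
```

Dividing by $\sqrt{d_k}$ keeps the scores on a comparable scale as the vector dimension grows, so the softmax does not saturate.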
To avoid exploding and vanishing gradients, a residual connection is introduced that adds the weight matrix and the features directly to form the hidden layer input, as shown in Eqs (7) and (8).
A gated recurrent unit (GRU) is then used to further extract features and form the feature vector matrix. Some feature information is discarded by the forgetting gate of the recurrent network, while the input gate adds new information to the node features, and the state is then layer-normalized, as shown in Eqs (9) and (10).
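The residual connection, the gated update, and the layer normalization can be sketched together. Eqs (7) to (10) are not reproduced in the text, so this sketch substitutes the standard GRU cell (update gate `z`, reset gate `r`) and a plain layer normalization; the residual form `features + attention @ features`, the parameter names, and all values are assumptions.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def layer_norm(h, eps=1e-5):
    """Normalize each row to zero mean and roughly unit variance."""
    mean = h.mean(axis=-1, keepdims=True)
    var = h.var(axis=-1, keepdims=True)
    return (h - mean) / np.sqrt(var + eps)

def residual_input(features, attention):
    """Assumed residual form: attention-weighted features added back to the features."""
    return features + attention @ features

def gru_step(x, h, p):
    """One standard GRU update followed by layer normalization."""
    z = sigmoid(x @ p["Wz"] + h @ p["Uz"])               # how much new information enters
    r = sigmoid(x @ p["Wr"] + h @ p["Ur"])               # how much old state is kept
    h_tilde = np.tanh(x @ p["Wh"] + (r * h) @ p["Uh"])   # candidate state
    return layer_norm((1.0 - z) * h + z * h_tilde)

# Hypothetical sizes: 3 year-nodes, 4 features per node.
rng = np.random.default_rng(0)
params = {name: rng.normal(scale=0.5, size=(4, 4))
          for name in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}
node_features = rng.normal(size=(3, 4))
uniform_attention = np.full((3, 3), 1.0 / 3.0)
hidden = gru_step(residual_input(node_features, uniform_attention),
                  np.zeros((3, 4)), params)
```

The residual term gives gradients a direct path around the attention layer, which is the usual motivation for adding it.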
The transformed NET is concatenated with the variables in Table 1 to form a matrix, and an LSTM is applied to this matrix for binary classification to predict forecast precision. The prediction is obtained through the activation function σ. Finally, the whole convolutional embedding process is completed by Eq (11).
The weight matrix is continuously updated through the aggregation update process. The model uses the cross-entropy loss function to complete the training process, and the model

Descriptive statistics
According to the descriptive statistics in Table 2, the mean value of PRECS in the sample firms is 0.2258, the standard deviation is 0.2153, and the maximum and minimum values are 1.3939 and 0.0000, respectively. This indicates that most firms issue narrow earnings forecast ranges, that is, management earnings forecasts of high quality. Common institutional investors are present in 9.28% of the firms. The mean of Coz_num, the log-transformed number of common institutional investors, is 0.0672; the mean shareholding ratio of common institutional investors (Coz_rate) is 2.00%, and its maximum is 56.33%. Table 3 presents the Pearson correlation matrix of the main variables. The correlations between the independent variables (Coz_dum, Coz_num, Coz_degree, Coz_rate) and PRECS are all negative and significant at the 1% level. This negative correlation between common institutional ownership and forecast width preliminarily supports the research hypothesis in Section 2. In addition, the control variables are significantly correlated with PRECS. No pairwise correlation coefficient exceeds 0.5, and the coefficients are generally small, so there is no severe multicollinearity among the variables of the model.

Table 4 presents the regression results for common institutional investors and the quality of earnings forecasts. Columns (1)-(4) exclude the control variables, and columns (5)-(8) include them. According to columns (5)-(8), common institutional investors (Coz) are negatively associated with forecast width (PRECS), that is, positively associated with forecast precision. The estimated coefficients of Coz_dum, Coz_num, Coz_degree, and Coz_rate are -0.0172 (t = -1.9772), -0.0219 (t = -1.8008), -0.0121 (t = -2.0226), and -0.0821 (t = -2.2928), significant at the 5%, 10%, 5%, and 5% levels, respectively. This indicates that common institutional investors on average narrow the width of management earnings forecasts and improve their precision, which supports H1. The finding is consistent with prior studies showing that common ownership increases disclosure [6,44-46].

PLOS ONE

Fig 3(A) shows the prediction results of the LSTM method without the exit threat variable (NET). Fig 3(B) presents the prediction results of the LSTM method with NET included only as an ordinary feature. Fig 3(C) shows the prediction results when NET is represented with the graph embedding method of this paper. As the density curves and the MAE, MSE, and RMSE metrics show, the graph embedding method proposed in this paper for constructing the exit threat variable provides better explanatory power in the deep learning model.
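For reference, the three error metrics used to compare the panels of Fig 3 can be computed as in the sketch below. The function name `regression_metrics` and the sample values are hypothetical and are not results from the paper.

```python
import math

def regression_metrics(y_true, y_pred):
    """Return MAE, MSE, and RMSE for paired true and predicted values."""
    errors = [pred - true for true, pred in zip(y_true, y_pred)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mse = sum(e * e for e in errors) / n
    return {"MAE": mae, "MSE": mse, "RMSE": math.sqrt(mse)}

# Hypothetical values: errors are 1, -1, 3, -3.
metrics = regression_metrics([0.0, 0.0, 0.0, 0.0], [1.0, -1.0, 3.0, -3.0])
```

MSE and RMSE penalize large errors more heavily than MAE, so reporting all three shows whether a model's improvement comes from fewer large misses or from uniformly smaller errors.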

Robustness tests
This paper adopts the following robustness tests. First, we address endogeneity. The industry mean of Coz and Coz lagged by three periods are selected as instrumental variables, and two-stage least squares (2SLS) and the generalized method of moments (GMM) are used for the instrumental variable tests. Second, we replace the measure of the dependent variable PRECS. PRECS2 is calculated as the absolute value of the difference between the upper and lower limits of the earnings forecast, divided by total assets at the beginning of the year. The closer PRECS2 is to 0, the higher the precision and the quality of the management earnings forecast disclosure. The results in Tables 5 and 6 show that the two robustness tests are generally consistent with the benchmark regression, with no significant changes in the coefficients and significance levels of the explanatory variables, which indicates that the benchmark regression results are robust.
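The two-stage least squares step can be illustrated with a minimal NumPy sketch on simulated data. Nothing here comes from the paper's sample: the data-generating process, the single instrument `z` (standing in for the industry mean or lagged Coz), the control `c`, and the function name `two_stage_least_squares` are assumptions for illustration, and standard errors are omitted.

```python
import numpy as np

def two_stage_least_squares(y, x_endog, instruments, controls):
    """Return 2SLS coefficients ordered as [const, x_endog, controls...]."""
    const = np.ones(len(y))
    Z = np.column_stack([const, instruments, controls])  # instruments + exogenous vars
    X = np.column_stack([const, x_endog, controls])      # second-stage regressors
    # Stage 1: project the regressors on the instrument set.
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    # Stage 2: regress the outcome on the fitted regressors.
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]

# Simulated example: u is an unobserved confounder, the true effect of x is -0.5.
rng = np.random.default_rng(0)
n = 5000
z, u, c = rng.normal(size=n), rng.normal(size=n), rng.normal(size=n)
x = z + u + 0.5 * c + rng.normal(size=n)
y = 1.0 - 0.5 * x + 0.3 * c + u + rng.normal(size=n)

beta_2sls = two_stage_least_squares(y, x, z, c.reshape(-1, 1))
beta_ols = np.linalg.lstsq(np.column_stack([np.ones(n), x, c]), y, rcond=None)[0]
```

In this simulation the OLS coefficient on `x` is biased by the confounder, while the 2SLS estimate is close to the true value, which is the reason for instrumenting Coz in the robustness test.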

Conclusion
The main finding of this paper is that common institutional investors help to improve the precision of management's earnings forecasts. This provides valuable insights into the role of institutional investors in shaping corporate decision-making and financial reporting practices, with implications for investors, analysts, regulators, and corporate governance practitioners. Based on this, we propose the following suggestions. First, owing to their advantages in capital size, industry-specific knowledge, and information collection and analysis, common institutional shareholders can obtain more critical financial information from enterprises in the same industry, create a platform for information sharing, and reduce the supervision costs of institutional participation in corporate governance. Second, listed companies should be encouraged to voluntarily disclose institutional shareholdings and to give full play to the monitoring and governance roles of information intermediaries. Finally, industry associations should strengthen communication between common institutional shareholders and the management of listed companies, further foster an institutional environment in which institutional investors participate in corporate governance and exploit their synergies, help improve the earnings forecast disclosure system, and promote mutual reinforcement and resource sharing among companies, investors, and the industry.
Supporting information
S1 File. (ZIP)
Author Contributions